the whole data          one country
Count      %            Count    %
51,383     87.24        711      84.74
4,241      7.20         66       7.87
1,502      2.55         17       2.03
1,771      3.01         45       5.36
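As a quick arithmetic check, the percentage columns appear to be each count's share of its column total. The sketch below recomputes them from the counts; the row labels are not available in this excerpt, and the names `whole_data`, `one_country`, and `column_shares` are illustrative only.

```python
# Counts copied from the table; row labels are not available in this excerpt.
whole_data = [51383, 4241, 1502, 1771]
one_country = [711, 66, 17, 45]


def column_shares(counts):
    """Return each count as a percentage of the column total, rounded to 2 places."""
    total = sum(counts)
    return [round(100 * c / total, 2) for c in counts]


whole_pct = column_shares(whole_data)
country_pct = column_shares(one_country)
print(whole_pct)
print(country_pct)
```

The recomputed shares agree with the percentages printed in the table, which supports reading each percentage as a share of its column total.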
Discrimination between countries based on genomics pattern
Discriminant analysis models were constructed to examine how the
genomics pattern deviates among the four countries. Pairwise
discrimination power between countries was examined, for example the
discrimination power between the USA and India. Each discrimination
model was constructed using the Lasso regression algorithm, which is a
linear model. Figure 7.19(a) shows the ROC curves from these
models. All models demonstrated almost perfect
discrimination power, indicating that the genomics patterns of sequences
from the four countries may differ significantly.
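The pairwise setup described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the coordinate-descent Lasso, the penalty `lam`, the placeholder country names, and the toy sequences are all hypothetical, and the AUC is computed on the training data only.

```python
import itertools

WORDS = ["".join(p) for p in itertools.product("ACGT", repeat=3)]  # the 64 3-mers
INDEX = {w: i for i, w in enumerate(WORDS)}


def kmer_frequencies(seq):
    """Relative frequency of each 3-mer over all overlapping windows of seq."""
    counts = [0] * len(WORDS)
    for i in range(len(seq) - 2):
        j = INDEX.get(seq[i:i + 3])
        if j is not None:  # skip windows containing ambiguous bases such as N
            counts[j] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]


def soft_threshold(z, lam):
    """Shrink z towards zero by lam; this is what makes Lasso weights sparse."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def fit_lasso(X, y, lam=0.001, n_sweeps=200):
    """Coordinate descent for (1/2n)*||y - b - Xw||^2 + lam*||w||_1."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = sum(y) / n
    resid = [yi - b for yi in y]  # current residual y - b - Xw
    for _ in range(n_sweeps):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(c * c for c in col) / n
            if z == 0.0:
                continue  # word never occurs, weight stays at zero
            rho = sum(c * (r + c * w[j]) for c, r in zip(col, resid)) / n
            new_w = soft_threshold(rho, lam) / z
            delta = new_w - w[j]
            if delta:
                resid = [r - c * delta for c, r in zip(col, resid)]
                w[j] = new_w
        shift = sum(resid) / n  # re-centre the unpenalised intercept
        b += shift
        resid = [r - shift for r in resid]
    return w, b


def predict(X, w, b):
    return [b + sum(wj * xj for wj, xj in zip(w, row)) for row in X]


def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparisons (ties count half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((a > c) + 0.5 * (a == c) for a in pos for c in neg)
    return wins / (len(pos) * len(neg))


# Four countries give six pairwise models; only USA and India are named in the text.
countries = ["USA", "India", "country_3", "country_4"]
pairs = list(itertools.combinations(countries, 2))

# Made-up sequences for one pair (label 0 vs label 1), for illustration only.
toy = [("ACG" * 10, 0), ("ACGT" * 8, 0), ("TTG" * 10, 1), ("TTGA" * 8, 1)]
X = [kmer_frequencies(s) for s, _ in toy]
y = [label for _, label in toy]
w, b = fit_lasso(X, y)
auc = roc_auc(predict(X, w, b), y)
selected = [WORDS[j] for j, wj in enumerate(w) if wj != 0.0]
```

In practice the same steps would be repeated for each entry of `pairs`, with the AUC estimated on held-out sequences rather than on the training set. A library implementation of L1-penalised regression would normally replace the hand-written `fit_lasso`.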
Figure 7.19: (a) The ROC curves of discrimination models constructed for discriminating
the genomics pattern from one country against the other country based on 3-mer word library.
(b) Heatmap of rankings of 64 words in six Lasso discrimination models.
Figure 7.19(b) shows the rankings of the 64 3-mers (words) when
discriminating between sequences from one country and sequences
from another country. There were six models in total for this investigation.
The heatmap shows that different words had different significant